Load forecasting method based on CEEMDAN and TCN-LSTM

To address the high stochasticity and volatility of power loads and the resulting difficulty of accurate load forecasting, this paper proposes a power load forecasting method based on CEEMDAN (Complete Ensemble Empirical Mode Decomposition with Adaptive Noise) and TCN-LSTM (Temporal Convolutional Network and Long Short-Term Memory network). The method combines the decomposition of raw load data by CEEMDAN with the spatio-temporal modeling capability of the TCN-LSTM model, aiming to improve the accuracy and stability of forecasting. First, the raw load data are decomposed into multiple relatively stationary subsequences by CEEMDAN, and sample entropy is then introduced to reorganize the subsequences. The reorganized sequences are used as inputs to the TCN-LSTM model to extract sequence features and perform training and prediction. The model is evaluated on electricity load data from New South Wales, Australia, and compared with traditional prediction methods. The experimental results show that the proposed algorithm achieves higher accuracy and better prediction performance in load forecasting, and can serve as a reference for electricity load forecasting methods.


Introduction
With the continuous expansion of the power system and the steady growth of power load, accurate power load forecasting has become increasingly important [1]. Power load forecasting is important for the dispatch and operation of power systems, and is the basis for power market regulation, power supply and demand balance, and energy planning [2]. Accurate forecasting of power load trends and changes is crucial for electric utilities and energy suppliers because they need to develop reasonable power dispatch and resource allocation strategies based on the forecast results to ensure the stability and reliability of the power system [3]. In addition, accurate power load forecasting is a key factor in facilitating renewable energy integration, energy trading markets and energy efficiency [4][5][6], which can improve economic and social benefits.
Traditional power load forecasting methods mainly include linear regression analysis, time series analysis, and BP neural networks. However, these methods have their own shortcomings, such as insufficient adaptation to nonlinear relationships and low prediction accuracy and robustness [7,8]. In recent years, with the rapid development and application of deep learning technology, power load forecasting methods based on deep learning have received widespread attention. Among them, models such as Convolutional Neural Networks (CNN) and Long Short-Term Memory networks (LSTM) have distinctive advantages and are widely used in time series forecasting. Feng et al. [9] used a convolutional neural network based approach to predict power loads and showed experimentally that it has better prediction performance than traditional methods. Mao et al. [10] proposed a new approach based on long short-term memory networks to capture long-term dependencies in power load data. Jiang et al. [11,12] attempted to apply deep learning models to some challenging problems in power load forecasting.
In addition, some scholars have proposed prediction models that combine sequence decomposition algorithms with machine learning algorithms. Commonly used time series decomposition algorithms include wavelet transforms [13], Empirical Mode Decomposition (EMD) [14] and Ensemble Empirical Mode Decomposition (EEMD) [15]. The purpose of combining these decomposition algorithms with machine learning algorithms is to transform a nonlinear, non-stationary time series into multiple relatively stationary subsequences, and to improve the prediction of the entire model by predicting the subsequences individually. Although decomposition is not a necessary step in time series forecasting, it has been widely used in forecasting studies in areas such as finance and wind speed [16,17], where it has achieved better prediction results. Zhang et al. [18] used empirical mode decomposition to decompose the original load sequence into multiple intrinsic mode functions (IMFs), which were then used as inputs to an LSTM model for load prediction; this effectively improved the prediction accuracy, especially for nonlinear and non-stationary load data. An improved short-term load forecasting model based on EMD and LSTM was proposed by Zhao et al. [19]. The experimental results show that the model significantly improves prediction accuracy and robustness. Qin [20] proposed a combined model based on EMD and the Extreme Learning Machine (ELM). The experimental results show that the proposed combined model adapts better to nonlinear and non-stationary load data and has high prediction accuracy. Luo [21] introduced a regional electric load forecasting method based on empirical mode decomposition and support vector regression (SVR). First, the original load sequence is decomposed into multiple IMF components using EMD; these components are then used to forecast future loads, and the forecasts are combined by the SVR model. The experimental results show that the proposed method adapts better to nonlinear and non-stationary load characteristics and improves the prediction accuracy. Huang et al. [22] proposed a power load forecasting method using empirical mode decomposition, least squares support vector regression (LS-SVR), and the firefly algorithm, which greatly improves the accuracy and stability of the forecast.
In order to improve the prediction accuracy on nonlinear and non-stationary power loads, this paper proposes a combined prediction model built from complete ensemble empirical mode decomposition with adaptive noise, a temporal convolutional network, and a long short-term memory network (CEEMDAN-TCN-LSTM). The method combines empirical mode decomposition and deep learning models, aiming to overcome the limitations of traditional deep learning methods and improve the accuracy and stability of power load forecasting. First, we use an empirical mode decomposition algorithm to decompose the raw power load data into multiple intrinsic mode functions (IMFs). Then, we take the decomposed IMFs as inputs and construct a TCN network to extract the spatio-temporal features of each IMF. In this paper, we introduce an attention mechanism into the TCN network, so that the TCN can better handle the interdependence between complex variables and thereby obtain a more accurate feature representation. Next, we employ a long short-term memory network to model the temporal dependencies of the IMFs. Finally, we integrate the CEEMDAN and TCN-LSTM models by stacking and jointly training the TCN and LSTM networks. Through this combination of multilevel and nonlinear structures, the prediction results obtained for each sequence are superimposed to obtain the complete predicted value of the original sequence. The combined CEEMDAN-TCN-LSTM model is compared experimentally with traditional deep learning methods and classical time series models. The experimental results show that the method has better accuracy and stability in power load forecasting than these baselines.

Complete ensemble empirical mode decomposition with adaptive noise
CEEMDAN is an improved decomposition method based on EEMD. EEMD essentially applies EMD to a signal several times and suppresses mode mixing by superimposing Gaussian white noise, which changes the extreme-point characteristics of the signal. However, EEMD cannot completely remove the noise during the decomposition process, so residual noise remains. To solve this problem, CEEMDAN introduces the concept of adaptive noise. In CEEMDAN, the variance of the noise is obtained by adaptive estimation at each IMF instead of using a fixed Gaussian white noise [23]. By incorporating adaptive white noise, CEEMDAN is able to completely separate the intrinsic mode functions and also reduces the reconstruction error. The CEEMDAN algorithm steps are as follows:
(1) Add Gaussian white noise to the original signal sequence to obtain a new sequence:
$$x_i(t) = x(t) + \varepsilon\,\omega_i(t), \quad i = 1, 2, \ldots, n \tag{1}$$
where $x(t)$ is the original signal; $x_i(t)$ is the new signal; $\varepsilon$ is the signal-to-noise ratio between signal and noise; $i$ indexes the $n$ additions of Gaussian white noise; and $\omega_i(t)$ is the added Gaussian white noise.
(2) Perform EMD on each new signal sequence and average the first modes of the decompositions to obtain the first IMF component $C_1(t)$ of the CEEMDAN decomposition. The first residual component is then obtained by removing $C_1(t)$ from the signal.
(3) Add Gaussian white noise to the first residual component obtained after decomposition and perform EMD again. Here $C_j(t)$ is the $j$th IMF component obtained from the CEEMDAN decomposition; $E_{j-1}$ is the $(j-1)$th IMF component of the EMD decomposition; and $r_j(t)$ is the residual component after the $j$th decomposition.
(4) Repeat the above steps until the residual component can no longer be decomposed, at which point the CEEMDAN decomposition is finished. The original signal sequence is thus decomposed into several IMF components $C_i(t)$ and a residual component $r(t)$.
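For reference, in the commonly used CEEMDAN formulation, steps (2) to (4) correspond to the expressions sketched below. This is a reconstruction under that assumption and may differ from the exact notation of this paper: the operator $E_k(\cdot)$ (the $k$th mode produced by EMD), the stage-dependent noise coefficient $\varepsilon_{j-1}$, and the number of IMFs $K$ are assumed symbols.

```latex
\begin{align}
C_1(t) &= \frac{1}{n}\sum_{i=1}^{n} E_1\bigl(x_i(t)\bigr), \\
r_1(t) &= x(t) - C_1(t), \\
C_j(t) &= \frac{1}{n}\sum_{i=1}^{n}
          E_1\bigl(r_{j-1}(t) + \varepsilon_{j-1}\,E_{j-1}(\omega_i(t))\bigr),
          \quad j \ge 2, \\
r_j(t) &= r_{j-1}(t) - C_j(t), \\
x(t)   &= \sum_{i=1}^{K} C_i(t) + r(t).
\end{align}
```

The last line expresses the completeness property: summing all IMF components and the final residual recovers the original signal.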

Sample entropy
Sample entropy is an algorithm that measures the degree of disorder and randomness of a sequence, and it is widely used for the quantitative analysis of sequences containing noise [24]. A larger sample entropy indicates a higher degree of disorder of the sequence, whereas a smaller value indicates better regularity of the time series, so it is suitable for the complexity analysis of the load data after CEEMDAN decomposition. Sample entropy, as an improved algorithm of approximate entropy, has better consistency and faster calculation speed, and its calculation process is as follows:
(1) Reconstruct the original signal sequence of length $N$ into vectors of dimension $m$.
(2) Calculate the distance $d_{ij}$ between each vector and the other vectors.
(3) Set the threshold value $r$, count the number $n_{ij}(r)$ of distances $d_{ij}$ that fall within the threshold, and calculate its ratio to the number of vectors $N - m + 1$; then take the average of this ratio over all vectors.
(4) Increase the dimension to $m + 1$ and repeat the above steps to get the corresponding mean.
(5) When the length $N$ of the time series is a finite value, the sample entropy is defined as the negative natural logarithm of the ratio of the $(m+1)$-dimensional mean to the $m$-dimensional mean.
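The calculation steps above can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in the paper. It assumes the Chebyshev (maximum) distance between vectors, a threshold of `r_factor` times the standard deviation of the series, and the common convention of using the same number of template vectors for both dimensions; the function name and default parameters are hypothetical.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a 1-D series, with threshold r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * x.std()

    def match_count(dim):
        # Build n - m template vectors of length `dim` (same count for m and m + 1).
        templates = np.array([x[i:i + dim] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to every later template
            # (self-matches are excluded, each pair is counted once).
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(d <= r))
        return count

    b = match_count(m)      # number of matching pairs of length m
    a = match_count(m + 1)  # number of matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")  # undefined for this series; reported as infinity here
    return float(-np.log(a / b))
```

Under these assumptions a regular signal such as a sine wave yields a much smaller value than white noise, which matches the interpretation that larger sample entropy means higher complexity.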

Temporal convolutional network
Temporal Convolutional Network (TCN), an improvement on the Convolutional Neural Network (CNN), is a kind of neural network for processing time series [25]. Compared to the traditional CNN, the feedback and convergence speed of the TCN model is further improved, and the learning ability of the network is further enhanced. Dilated causal convolution is the core part of the TCN network. In dilated causal convolution, parameters such as the convolution kernel size, the number of convolution layers and the dilation coefficients can be adjusted to extract features from the time series as a whole and capture its temporal dependence. The network structure of TCN is shown in Fig 1. The dilated causal convolution samples its input with the dilation rate d as the base interval: at the bottom layer, d = 1 means that features are extracted from every input value; d = 2 means that features are extracted from one of every two inputs. The dilation rate in TCN networks grows exponentially with the number of layers, which gives the TCN network a long receptive field with only a few layers and thus richer output features.
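To make the dilation mechanism concrete, the sketch below implements a single-channel dilated causal convolution in NumPy and computes the receptive field of a stack of such layers. It is an illustration only, assuming zero padding on the left and one convolution per layer; the function names are hypothetical and this is not the network code used in the paper.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """y[t] = sum_k kernel[k] * x[t - k * dilation], treating x[t] = 0 for t < 0."""
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])  # left padding keeps the output causal
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        start = pad - k * dilation
        y += w * padded[start:start + len(x)]
    return y

def receptive_field(kernel_size, dilations):
    """Number of past inputs seen by one output of a stack with one conv per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With a kernel size of 3 and dilation rates 1, 2, 4 and 8, `receptive_field(3, [1, 2, 4, 8])` gives 31 time steps from only four layers, which is the exponential growth described above.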
Compared with the traditional CNN, TCN incorporates a residual connection module in model training, which aims to solve the problem of information loss that occurs when the number of TCN layers is large. When a convolution is performed, this module adds the input to the output to ensure that information is not lost. The module combines two dilated causal convolutional layers with other related components to form a residual block, the basic unit of the TCN network, whose structure is shown in Fig 2.
In practical power load forecasting, the factors affecting power load include multiple variables such as ambient temperature, humidity, and electricity price. Therefore, this paper introduces an attention mechanism on top of the TCN network to handle the complex dynamic dependencies between multiple variables. The model structure is shown in Fig 3.
The first layer is the TCN layer, which captures the temporal information of the input multivariate time series and outputs the information representation $f_t$ at time $t$. The second layer is a CNN that performs feature extraction on the input temporal representation $f_t$. The input is reconstructed as a one-dimensional matrix $F = \{f_1, f_2, \ldots, f_{t-1}\}$, which is passed to the CNN convolution kernels for feature extraction. In the output, $F^C_{i,j}$ is the convolution value of the $i$th row vector and the $j$th convolution kernel; $C$ is the convolution kernel; and $l$ is the length of the time series.
The sigmoid function is chosen as the activation function in the last layer; the output $F^C_{i,j}$ is scored and the weight $\alpha_i$ is calculated. The obtained $\alpha_i$ is applied to the row vectors of $F^C_{i,j}$ to obtain the new temporal information $v_t$.
Finally, $f_t$ is fused with $v_t$ to obtain the final output value $y_{t+1}$ at time $t+1$, where $\varepsilon$ is the output coefficient. The formulae for the variables in the LSTM network are shown below.

Long short-term memory networks
Here $h_t$ is the output at the current moment, and $\sigma$ and $\tanh$ are the activation functions.
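As an illustration of how the gates produce the output at the current moment, the following NumPy sketch performs one step of a standard LSTM cell. It follows the textbook formulation with input, forget and output gates and may not match the exact notation of this paper; the stacked weight layout `[input, forget, candidate, output]` and all names are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,); H is the hidden size."""
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    g = np.tanh(z[2 * H:3 * H])  # candidate cell state
    o = sigmoid(z[3 * H:4 * H])  # output gate
    c_t = f * c_prev + i * g     # keep part of the old memory, add new information
    h_t = o * np.tanh(c_t)       # output of the current moment
    return h_t, c_t
```

The additive update of the cell state `c_t` is what lets gradients flow over long spans, which is the usual explanation for why the LSTM retains long-term dependencies better than a plain RNN.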

Electricity load forecasting model based on CEEMDAN and TCN-LSTM
Due to the influence of temperature, humidity, electricity price and other factors, power load data are nonlinear and non-stationary to a certain degree. In order to reduce the influence of these characteristics of the raw load data on future prediction, a combination of signal decomposition and deep learning neural network modeling is used to forecast electricity load. The model proposed in this paper introduces sample entropy on top of existing prediction models based on mode decomposition. By reorganizing the decomposed subsequences into new sequences with significant differences in complexity, the error accumulated over multiple predictions is reduced, and the prediction efficiency is also improved. The reorganized data sequences are then learned and predicted by combining the TCN network and the LSTM network, which captures the effective features and dependencies in the data and makes the prediction results more reliable.
The TCN-LSTM model in this paper gives full play to the strengths of both networks: the raw data are fed into the TCN network to extract feature information, and the extracted time-series features are then fed into the LSTM network, which learns the dependencies between the data, in order to achieve the best prediction effect. The TCN-LSTM model of this paper is shown in Fig 5.

Case study
In order to validate the effectiveness of the CEEMDAN and TCN-LSTM method proposed in this paper, and taking into account the periodicity of the power load data and its influencing factors, this paper selects the comprehensive power load data of New South Wales, Australia, from January to December 2010, with a sampling interval of 1 h. The data comprise the power load, dry bulb temperature, wet bulb temperature, dew point temperature, humidity, and tariff for 24 h a day, and are divided into a training set, a validation set, and a test set in the ratio 8:1:1.
The collected power load, dry bulb temperature, wet bulb temperature, dew point temperature and tariff data are of different orders of magnitude and units. In order to avoid saturation of the neurons, which would affect the feature analysis, the data set is first normalized to map the data into the range [0,1], calculated as follows:
$$x_p = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x$ is the input data; $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data; and $x_p$ is the normalized data.
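A minimal sketch of this min-max normalization is given below. It assumes the features are stored column-wise in a NumPy array; the helper names are hypothetical, and the inverse transform is included only because predictions usually need to be mapped back to the original load scale before errors are reported.

```python
import numpy as np

def minmax_fit(data):
    """Return the per-column minimum and maximum of a (samples, features) array."""
    data = np.asarray(data, dtype=float)
    return data.min(axis=0), data.max(axis=0)

def minmax_transform(x, x_min, x_max):
    """Map x into [0, 1] using x_p = (x - x_min) / (x_max - x_min)."""
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

def minmax_inverse(x_p, x_min, x_max):
    """Map normalized values back to the original scale."""
    return np.asarray(x_p, dtype=float) * (x_max - x_min) + x_min
```

Whether the minimum and maximum are taken over the whole data set or only over the training portion is a design choice that the text does not specify; the sketch works with either.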
Meanwhile, in order to evaluate the prediction performance of the model more intuitively and accurately, this paper selects the mean absolute error (MAE), the root mean square error (RMSE) and the mean absolute percentage error (MAPE) as the error evaluation indexes, which are calculated by the following formulas:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - x_i\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i\right)^2}$$
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - x_i}{x_i}\right| \times 100\%$$
where $n$ is the total sample size of the electric load; $y_i$ is the predicted value of the electric load; and $x_i$ is the real value of the electric load.
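The three evaluation indexes can be computed as in the short sketch below, which follows the standard definitions with the predicted values as `y` and the real values as `x`. The function names are illustrative, and MAPE is returned in percent and assumes that no real value is zero.

```python
import numpy as np

def mae(y, x):
    """Mean absolute error between predictions y and real values x."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return float(np.mean(np.abs(y - x)))

def rmse(y, x):
    """Root mean square error between predictions y and real values x."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean((y - x) ** 2)))

def mape(y, x):
    """Mean absolute percentage error, in percent; x must not contain zeros."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return float(np.mean(np.abs((y - x) / x)) * 100.0)
```

RMSE penalizes large individual errors more strongly than MAE, while MAPE expresses the error relative to the load level, so the three indexes are complementary.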

CEEMDAN sequence decomposition
In order to improve the prediction accuracy, the CEEMDAN algorithm was used to process the normalized data. The CEEMDAN parameters were set as follows: the standard deviation of the added Gaussian white noise was 0.2, and the number of noise realizations added was 300.
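The decomposition result is shown in Fig 7, and the sample entropy value of each component is given in Fig 8 and Table 1.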
From Fig 8 and Table 1, it can be seen that the sample entropy values of IMF1 and IMF2 are both greater than 1.75 and very close to each other, indicating that the complexity of these two sequences is relatively high and that their probability of generating a new pattern is basically the same. IMF1 and IMF2 are therefore superimposed into a new subsequence for training and prediction, called NIMF1. The sample entropy value of IMF3 differs from those of the other IMF components and its complexity is higher, so IMF3 is used as a separate subsequence for training and prediction, called NIMF2. The sample entropy value of IMF4 is close to that of the residual component Res, with a difference of only 0.017, so IMF4 is superimposed on Res as a new subsequence for training and prediction, called NIMF3. The sample entropy values of IMF5 and IMF6 are very close to each other, which means that the two sequences have similar complexity and basically the same probability of generating a new pattern, so IMF5 and IMF6 are superimposed as a new subsequence, called NIMF4. The sample entropy values of the four components IMF7~IMF10 are relatively close to each other, so these four sequences are superimposed to obtain the new subsequence NIMF5.
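The regrouping described above amounts to summing selected components. The sketch below shows this under the assumption that the CEEMDAN output is stored as an array whose rows 0 to 9 are IMF1 to IMF10 and whose row 10 is Res; the array layout, the `GROUPS` mapping and the function name are hypothetical, and the grouping itself is the one read from the text.

```python
import numpy as np

# Row indices of the components merged into each reorganized subsequence.
GROUPS = {
    "NIMF1": [0, 1],        # IMF1 + IMF2
    "NIMF2": [2],           # IMF3 alone
    "NIMF3": [3, 10],       # IMF4 + Res
    "NIMF4": [4, 5],        # IMF5 + IMF6
    "NIMF5": [6, 7, 8, 9],  # IMF7 to IMF10
}

def regroup(components, groups=GROUPS):
    """Sum the rows listed for each group of a (n_components, T) array."""
    components = np.asarray(components, dtype=float)
    return {name: components[idx].sum(axis=0) for name, idx in groups.items()}
```

Because every component appears in exactly one group, the five reorganized sequences still sum to the original signal, so summing their individual forecasts gives a forecast of the full load while only five models need to be trained instead of eleven.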

Analysis of prediction results
Samples are constructed from the reorganized sequences NIMF1~NIMF5 with a sliding window for iterative prediction. The window size is set to 24, i.e., the 24 data points of the previous 24 h are taken as the features, while the power load of the next hour is taken as the label. The MAE, RMSE, and MAPE values as well as the prediction accuracies of each model are given in Table 3. As can be seen from Table 3, compared with the single deep [...] 2.31% respectively, which indicates that the prediction performance of the model is further improved after the load time series is stabilized by CEEMDAN decomposition. Meanwhile, the algorithm in this paper reduces the error accumulation caused by multiple predictions after introducing sample entropy for sequence complexity analysis and time series reconstruction, which also effectively improves the prediction accuracy of the CEEMDAN-TCN-LSTM model.
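The sliding-window construction with a window of 24 hours can be sketched as follows, together with an 8:1:1 split. This is an illustration for a single univariate sequence; the function names are hypothetical, and splitting in chronological order is an assumption, since the text only states the ratio.

```python
import numpy as np

def make_windows(series, window=24):
    """Features: `window` consecutive values; label: the value of the next hour."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    y = series[window:]
    return X, y

def chronological_split(X, y, ratios=(0.8, 0.1, 0.1)):
    """Split samples in time order into training, validation and test sets."""
    n = len(X)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = (X[:n_train], y[:n_train])
    val = (X[n_train:n_train + n_val], y[n_train:n_train + n_val])
    test = (X[n_train + n_val:], y[n_train + n_val:])
    return train, val, test
```

Keeping the samples in time order means the model is evaluated only on hours that come after everything it was trained on, which avoids leaking future information into training.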

Concluding remarks
In this paper, a short-term power load forecasting method based on CEEMDAN decomposition and a combined TCN-LSTM network model is proposed. The method first decomposes the power load data into sequences by CEEMDAN, then reorganizes the decomposed sequences into new, relatively stationary sequences according to their sample entropy values, and finally trains a TCN-LSTM network with an attention mechanism on the reorganized sequences to forecast the future power load. The algorithm in this paper is compared with existing signal decomposition and deep learning algorithms, and the following conclusions are obtained:
1. Compared with the traditional signal decomposition method, the CEEMDAN decomposition method adopted in this paper decomposes the time series data more completely, better reduces the error of sequence reconstruction, and improves the prediction accuracy to some extent.
2. Sample entropy is introduced to analyze the complexity of each subsequence after CEEMDAN decomposition, and the subsequences are reconstructed into new, relatively stationary sequences according to their sample entropy values, which further improves the accuracy of prediction.
3. The attention mechanism is introduced into the TCN spatio-temporal convolutional network, so that the TCN can better handle the interdependence between complex variables, which enhances the feature mining and learning ability of the TCN network.
4. A combined network model of TCN and LSTM is constructed, in which the data pass through the TCN network and then the LSTM network. This avoids the inadequate extraction of potential features of the time series data by a single neural network and thus improves the overall load prediction accuracy.
5. Compared with other common forecasting algorithms, the algorithm in this paper predicts the changes in actual load data more accurately and performs better on the evaluation indexes, which can provide a reference for the power sector's load forecasting and electric energy scheduling.

Long Short-Term Memory Network (LSTM) is an improved form of the Recurrent Neural Network (RNN), which effectively solves the problems of gradient explosion and gradient vanishing that often occur in RNNs [26][27][28]. Compared with the traditional RNN, the LSTM is better able to capture and memorize long-term dependencies: it introduces input gates, forget gates, and output gates to control the transfer of information, and it can adaptively learn and capture the dependency information in sequence data. Its model structure is shown in Fig 4 below.

Fig 4. LSTM network model diagram. https://doi.org/10.1371/journal.pone.0300496.g004
The load forecasting model based on CEEMDAN and TCN-LSTM proposed in this paper is shown in Fig 6, and its TCN-LSTM component in Fig 5. The model prediction steps can be roughly divided into four steps:
1. Data preprocessing of the load data by means of the CEEMDAN decomposition algorithm, which decomposes the input data into a number of relatively stationary intrinsic mode functions (IMFs) and a residual component Res.

Fig 6. CEEMDAN and TCN-LSTM based model structure. https://doi.org/10.1371/journal.pone.0300496.g006
As shown in Fig 7, after CEEMDAN decomposition, the original time series is decomposed into 10 IMF components and one residual component Res. The IMF components are arranged in order of frequency from high to low, and each component is relatively stable without mode mixing.

Fig 8. Sample entropy values for each IMF component. https://doi.org/10.1371/journal.pone.0300496.g008
The training set is fed into the TCN-LSTM network model of this paper for training, while the test set is used for the prediction of the next hour and the validation set is used to verify the accuracy of the prediction. The learning rate of the TCN-LSTM network is set to 0.01, the dropout regularization parameter is 0.1, the convolution kernel size is 3, the activation function is a sigmoid function, and the number of training rounds is 50. Fig 9 shows the loss curve of the reconstructed sequence after training by the TCN-LSTM network model. As can be seen from Fig 9, the TCN-LSTM model converges quickly and shows no overfitting, which indicates that the accuracy and generalization ability of the model are good. To further validate the performance of the CEEMDAN-TCN-LSTM network and its accuracy in power load forecasting, comparative experiments are conducted on the same training set, test set and validation set with the SVR algorithm, the TCN network model, the LSTM network model, the TCN-LSTM network model, and the EMMD-TCN-LSTM (hereafter E-T-L) hybrid model. In the experiment, 300 sampling points are selected to